By Stew Magnuson


In the more than two years since ChatGPT exploded into public consciousness, defense acquisition professionals have been tinkering with it and other generative artificial intelligence systems to see how the tools can ease the rigors of their jobs.

Generative AI allows defense acquisition personnel to organize unstructured information such as after-action reports, research, white papers, rules and regulations, and contracts, according to Alexis Bonnell, chief information officer and director of the digital capabilities directorate at the Air Force Research Laboratory.

“It’s all of the things that you know are not structured but are critical to our function and our ability to maintain advantage. So, what becomes critical now with generative AI is — for the first time in human history — we actually can put all of the knowledge in play,” she said during a webinar organized by the Acquisition Innovation Research Center, a Defense Department-funded organization located at the Stevens Institute of Technology in New Jersey.

Acquisition professionals are looking at generative AI to “reduce their pain,” she added.

“What are the things they hate doing, or [that are] just hard or are a time suck? … What takes too much time? What’s the toil and how do I reduce that?

“So now that I’ve kind of taken the pain, I’ve organized my relationship with knowledge, now I can start to optimize my workflow, my process, what I might do differently,” Bonnell said.
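In practice, that organizing step often amounts to asking a model to reduce each unstructured document to a few tagged fields that can be indexed and searched. The sketch below is one minimal way to do that, assuming the OpenAI Python SDK (version 1.0 or later) with an API key in the environment; the model name and the JSON schema are illustrative, not anything Bonnell prescribed.

```python
# A minimal sketch of "organizing the relationship with knowledge":
# reduce one unstructured document to tagged, searchable fields.
# Assumes the OpenAI Python SDK (>= 1.0) and an OPENAI_API_KEY env var;
# the model name and JSON schema are illustrative.
import json

from openai import OpenAI

client = OpenAI()

PROMPT = (
    "Summarize the document below as JSON with keys 'title', "
    "'topics' (a list of strings) and 'summary' (2-3 sentences). "
    "Return only the JSON.\n\n{doc}"
)

def structure_document(doc_text: str) -> dict:
    """Ask the model to reduce an unstructured document to structured fields."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        response_format={"type": "json_object"},  # request parseable JSON
        messages=[{"role": "user", "content": PROMPT.format(doc=doc_text)}],
    )
    return json.loads(response.choices[0].message.content)
```

Records produced this way can be indexed and searched like any structured data, which is the point of putting "all of the knowledge in play."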

Douglas Schmidt, then-director of operational test and evaluation in the Office of the Secretary of Defense, said even though generative AI has been available to the public for more than two years, acquisition personnel still aren’t well versed in it — and even those who have become acquainted with it are often not using it as effectively as they could.

“It’s getting people aware of what’s possible. It’s getting people to start kicking the tires and trying this stuff,” said Schmidt, who since the webinar has left the Pentagon and returned to academia. “One of the challenges we have is trying to get people who are going to be users to come up to speed as rapidly as possible on the patterns and best practices in order to be able to get these large language models and other generative AI tools to do our bidding.”

In late January, OpenAI unveiled ChatGPT Gov, a new tailored version of ChatGPT designed to provide U.S. government agencies with an additional way to access the company’s frontier models.

AI can help the government strengthen national security, an OpenAI press release stated. “We believe the U.S. government’s adoption of artificial intelligence can boost efficiency and productivity and is crucial for maintaining and enhancing America’s global leadership in this technology.”

Generative AI tools like ChatGPT can help speed up the defense acquisition system, often criticized as too slow, in several ways, the experts said during the webinar.

John Robert, deputy director of the software solutions division at Carnegie Mellon University’s Software Engineering Institute, said the technology can be an “accelerator” in areas such as prototyping.

“It can help bring ideas to the forefront very quickly. So that means generative AI can help accelerate the creation of a software prototype or a software model very quickly,” he said.

The technology can also verify whether software can perform certain tasks and help connect it to other systems, he said.

“These systems that we have today, they’re not isolated, they’re connected to so many different other systems. And one of the most common aspects is trying to connect software to different systems,” he said.

A second area is software verification, “an area where people don’t want to spend as much time because it’s very cumbersome or very time-consuming,” he said. “Software verification involves a lot of testing, and generative AI is excellent at helping [create] test cases, but it can also be very helpful for creating [models] and doing some data analytics related to software verification.”
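A hedged sketch of that test-generation workflow follows, reusing the same illustrative client and placeholder model name as above. The draft_test_cases helper is hypothetical, and whatever the model returns is a draft for a human to review and run, never a trusted test suite.

```python
# A sketch of drafting unit-test cases for a function under review.
# Client, model name and helper name are illustrative assumptions;
# generated tests must be reviewed and executed before being trusted.
from openai import OpenAI

client = OpenAI()

def draft_test_cases(source_code: str, framework: str = "pytest") -> str:
    """Ask the model for candidate tests covering normal and edge cases."""
    prompt = (
        f"Write {framework} test cases for the function below. Cover "
        "normal inputs, boundary values and invalid inputs.\n\n"
        f"{source_code}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content  # review before committing
```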

Another time-consuming task is regulatory and policy compliance, Robert said.

“We live in the acquisition community with a lot of regulatory documents, PDF documents, Word documents of all shapes and forms, and a lot of the human-intensive activity is about comparing the results from a current system or acquisition to those regulatory documents to see if there is consistency or inconsistency,” he said. “I think it turns out that generative AI is very good at handling [information technology] information. For example, here’s a regulatory document and here’s some information, and then asking it to highlight areas where things are not consistent or where there may be gaps.”
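One minimal way to frame the comparison Robert describes is to pair a regulatory excerpt with program documentation in a single prompt and ask the model to flag gaps for human review. The client, model name and flag_compliance_gaps helper below are illustrative assumptions, as in the earlier sketches.

```python
# A sketch of the document comparison Robert describes: regulation plus
# program documentation in one prompt, with the model asked to flag gaps.
# Client, model name and helper name are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

def flag_compliance_gaps(regulation: str, program_doc: str) -> str:
    """Return a model-drafted list of possible inconsistencies for human review."""
    prompt = (
        "Compare the program documentation against the regulatory text. "
        "List each requirement that appears unmet, inconsistent or "
        "unaddressed, citing the relevant passage from each document.\n\n"
        f"REGULATORY TEXT:\n{regulation}\n\n"
        f"PROGRAM DOCUMENTATION:\n{program_doc}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```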

Bonnell said another potential time-saver for a cumbersome task that occurs early in the acquisition process is market research.
Asking the generative AI system if it knows of previously developed technology can provide awareness of the alternatives before going down the road of building something entirely new, she said.

Schmidt said that generative AI tools are now being applied to annual reports to assess trends or areas of concern.

“Having the ability to use generative AI at scale to do this type of analysis is tremendously useful for decision-makers because it gives them a more holistic view,” he said.
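Doing that analysis at scale typically means working around context-window limits — a constraint Schmidt returns to later in the article — with a summarize-then-synthesize pass: compress each report individually, then ask for trends across the compressed summaries. A rough sketch, again with an illustrative client and placeholder model name, and an assumed reports/ directory of plain-text files:

```python
# A sketch of summarize-then-synthesize trend analysis across many
# reports, one common workaround for context-window limits. The
# reports/ directory, client and model name are illustrative.
from pathlib import Path

from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Map step: compress each report individually so no single call overflows.
summaries = [
    ask("Summarize the key findings and areas of concern in this report:\n\n"
        + path.read_text())
    for path in sorted(Path("reports").glob("*.txt"))
]

# Reduce step: ask for cross-report trends over the compressed summaries.
print(ask(
    "From these per-year summaries, identify recurring trends and "
    "emerging areas of concern:\n\n" + "\n\n---\n\n".join(summaries)
))
```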

Heather Wojton, director of the operational evaluation division at the Defense Department-funded Institute for Defense Analyses, said another area where generative AI can reduce time and workloads is test and evaluation.

The trend today is to insert the warfighters who will be using the weapon systems into the process as early as possible for so-called “touchpoints,” and generative AI models can help give operators an idea of what the actual interface they’ll encounter will look like, “and we can start to do user testing much earlier,” she said.

Instead of waiting until much later in the acquisition process, when developers have fully actualized interfaces, teams can create these mockups much more quickly, iterate through them and then apply the collective knowledge about what makes a good interface, she said.
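As one hedged illustration of such a touchpoint, a model can be asked to emit a throwaway HTML mockup from a one-line interface description. The spec string, client and model name below are invented for the example; the output is a disposable artifact to put in front of operators and regenerate as feedback comes in.

```python
# A sketch of generating a throwaway HTML mockup for an early touchpoint.
# The spec string, client and model name are invented for the example;
# the output is a disposable artifact to iterate on with operators.
from pathlib import Path

from openai import OpenAI

client = OpenAI()

spec = ("A status display with a map pane on the left, a scrolling alerts "
        "list on the right and an acknowledge button under each alert.")

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{
        "role": "user",
        "content": "Return a single self-contained HTML file (no markdown "
                   "fences) that mocks up this operator interface: " + spec,
    }],
)
Path("mockup.html").write_text(response.choices[0].message.content)
```

Opening mockup.html with an operator, collecting feedback and regenerating closes the loop Wojton describes, long before the real interface exists.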
Generative AI can also help the test and evaluation community create scenarios for assessing new equipment, “which is very exciting,” Wojton said.

“In fact, we’ve started to see a little bit of this happen in our operational testing already,” she added.

Schmidt asked how the Defense Department should balance the need for rapid adoption of generative AI with the need to apply rigorous testing to prove and validate AI-enabled systems. “Hallucinations,” or AI-created errors, are one potential pitfall.

“The department should start small and then scale up. So, use generative AI in low-risk, less critical areas first, right?” Schmidt said. “Think about the mundane things that you’re doing in your day to day. If there’s a way for you to use AI to help with your calendar, for example, please do, right? And then we can scale up from there into mission areas where things are going to, you know, get more dicey.”

While AI tools typically work fairly well on one document or a couple of documents at a time, officials cannot yet analyze all the annual reports from the past 30 or 40 years at once because the technology doesn’t yet have that capability. That’s an area where more research and development is needed, he added.

The test and evaluation community is going to be there along the way, learning about generative AI, how to test it and how to characterize its performance, right along with the departments, Schmidt said.

“I think that’s exciting, and we shouldn’t wait for the perfect system to do that. We should jump in and get our feet wet and engage in a lot of integrated testing. We want to see testing across the full development life cycle of these systems, so that we’re intimately familiar with them, and that we’re testing all along the way,” he said.

Generative AI systems still make mistakes, and being able to use them in trust-but-verify protocols is going to be crucially important for the short- and mid-term as the technologies transition to widescale use in the defense acquisition community, he said.

The test community should be viewed as more of a partner than a watchdog on the performance of the generative AI systems, he added.

Another potential pitfall is security. The data collected from prompts entered into generative AI tools is then used for training, “which isn’t necessarily bad in general, but it’s very risky for the DoD and other people who don’t want their information out in the wild,” he said.

While some open-source tools can offer more privacy, “the downside is those models are often not as sophisticated, and so users tend to see a lot more hallucinations, especially if they don’t use them properly,” Schmidt said.

Generative AI, he concluded, “is not about trying to replace people. It’s not trying to supplant people. It’s about enhancing and supplementing and enriching — but we still have to have the humans in the loop, so it’s about getting people to realize these tools are quite effective when used properly.” ND


