Organisations are banning employees from using content generated with ChatGPT as data privacy and security concerns continue to mount.
In March, a group of tech industry leaders published an open letter calling for a pause in AI development until legal, compliance and scientific institutions can catch up with the latest developments and properly evaluate the possible technological and societal implications.
"The advancements of AI and machine learning (ML) technologies have been rapid recently, creating an arms race among technology giants,” said Vaidotas Šedys, Head of Risk Management at Oxylabs.” In such circumstances, it is no surprise that proper evaluation of the legal, ethical, and security implications is still lacking.
"With the architecture behind the latest version of ChatGPT, the large language model GPT-4, introduced only recently, there is little information about the data it has been trained on. It is also unclear what information it might store from interactions with individual users beyond usage statistics, and such obscurity creates many legal and compliance risks.
"Individual employees might leak sensitive company data or code when interacting with popular generative AI solutions. While there is no concrete evidence that data submitted to ChatGPT or any other generative AI system is stored and shared with other people, the risk still exists, as new and less-tested software often has security gaps."
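As one illustration of how such leaks might be reduced, the minimal sketch below masks a few common secret patterns in a prompt before it leaves the organisation. The patterns, function name and placeholder are illustrative assumptions rather than a description of any tool mentioned in this article; a real deployment would rely on a vetted secret-scanning or data-loss-prevention ruleset.

```python
import re

# Illustrative patterns for common secrets; a real deployment would use a
# maintained secret-scanning ruleset rather than this short, assumed list.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                       # AWS access key IDs
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),     # PEM private key headers
    re.compile(r"(?i)(api[_-]?key|token)\s*[:=]\s*\S+"),   # key/value credentials
]

def redact(prompt: str) -> str:
    """Replace anything matching a known secret pattern with a placeholder."""
    for pattern in SECRET_PATTERNS:
        prompt = pattern.sub("[REDACTED]", prompt)
    return prompt

if __name__ == "__main__":
    text = "Deploy with api_key=sk-abc123 to the staging cluster."
    print(redact(text))  # -> "Deploy with [REDACTED] to the staging cluster."
```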
So far, OpenAI has been reluctant to release more information about how it handles the data that ordinary users submit to the system. The leaking of confidential code fragments is a sizable risk of using free generative AI solutions at work. Many organisations struggle to mitigate this risk, as doing so requires constantly monitoring employee activity and raising alerts when employees use platforms such as ChatGPT or GitHub Copilot.
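A minimal sketch of what such alerting could look like, assuming the organisation already collects outbound traffic logs as plain text; the log layout and domain list below are illustrative assumptions, not the output of any specific proxy or monitoring product.

```python
# Minimal sketch: flag egress log lines that mention known generative AI
# services. Assumes a simple "timestamp user domain" log format, which is
# an illustrative assumption rather than any specific proxy's output.
MONITORED_DOMAINS = {"chat.openai.com", "api.openai.com", "copilot.github.com"}

def scan_log(lines):
    """Yield (user, domain) pairs for requests to monitored AI services."""
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip malformed lines
        _timestamp, user, domain = parts[0], parts[1], parts[2]
        if domain in MONITORED_DOMAINS:
            yield user, domain

sample = [
    "2023-05-02T09:14:11Z alice chat.openai.com",
    "2023-05-02T09:15:02Z bob intranet.example.com",
]
for user, domain in scan_log(sample):
    print(f"ALERT: {user} accessed {domain}")
```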
Vaidotas continued, "Further risks include using wrong or outdated information, especially in the case of junior specialists, who are often unable to evaluate the quality of the AI's output. Most generative models are trained on large but finite datasets that need constant updating. These models also have a limited context window and can struggle when dealing with new information. OpenAI has admitted that its latest model, GPT-4, still hallucinates facts."
Companies such as Stack Overflow, one of the largest developer communities, have led the charge by temporarily banning content generated with ChatGPT due to its low accuracy, which risked misinforming users looking for coding answers.
"Using free generative AI solutions also poses a risk of legal sanctions, with GitHub Copilot already facing a lawsuit over its use of copyrighted fragments from public, open-source code repositories. As AI-generated code can contain proprietary information or trade secrets belonging to another company or person, a company whose developers use such code might be liable for infringing third-party rights. Moreover, failure to comply with copyright law, if discovered, might affect how investors evaluate the company."
Vaidotas concluded, "Total surveillance in the workplace is neither a desirable nor a feasible solution, as organisations cannot observe every employee. Individual awareness and responsibility are therefore crucial, and steps must be taken to educate the general public about the potential risks of using generative AI solutions. For now, it is unclear who can copyright or own AI-generated works. Although many questions remain unanswered, companies must identify ways to mitigate the risks."