It is unfortunately more complex than that: the output depends on the data the algorithm was trained on and on your input, if any is required. So if the training data set contains copyrighted material, the output can potentially draw on that material.

“Fixing the copyright issue” means that you would either need to train the algorithm only on data under a free-to-use licence (e.g. Creative Commons), or have the algorithm check its output for similarity with copyrighted material and “punish” it when it gets too close.
The former reduces the quality of the AI’s output because there is less free data to train on, while the latter is not foolproof: there is no clear definition of the limit (how close is too close?), and the check is only as good as the underlying database it compares against.
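
To make the second option a bit more concrete, here is a rough sketch of what such a similarity check could look like. Everything in it is an assumption for illustration: the function names, the word 3-gram overlap measure, and the 0.5 threshold are made up, and real systems would likely use something more sophisticated. It also shows both weak points mentioned above: the threshold is arbitrary, and only works present in `protected_texts` can be detected.

```python
def ngrams(text, n=3):
    """Return the set of word n-grams in a text (lowercased)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items / all items."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def copyright_penalty(output, protected_texts, threshold=0.5, n=3):
    """Hypothetical penalty: how far the closest match exceeds the threshold.

    `threshold` is an arbitrary choice ("how close is too close?"), and
    `protected_texts` stands in for the underlying database of known works.
    """
    out = ngrams(output, n)
    closest = max((jaccard(out, ngrams(t, n)) for t in protected_texts),
                  default=0.0)
    return max(0.0, closest - threshold)


protected = ["the quick brown fox jumps over the lazy dog"]
print(copyright_penalty("the quick brown fox jumps over the lazy dog", protected))
print(copyright_penalty("a completely different sentence about something else", protected))
```

A verbatim copy gets a positive penalty and unrelated text gets none, but a light paraphrase would slip through a check this simple.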
