Your article claims that the models cannot be reverse-engineered to reveal the training data, but there are cases where researchers have done exactly that. Anyone who shares data this way will need strong assurances that the resulting models don’t contain residue of that data within them, and that seems a difficult thing to guarantee.

Imagine that an insurance adjustment algorithm (which didn’t have direct access to medical records) begins querying the API of a Google-like service (one that did have direct access to the private data). With enough API queries, that insurance system could learn a great deal, perhaps enough to give it insight into medical risks, or even to characterize profiles of people who suffer from rare diseases or combinations of conditions. Have we now, in effect, shared this private medical data with the insurer’s algorithms as well?

We should also consider what “reading messages” really means: just because a human doesn’t read a message doesn’t mean it isn’t being read. If an AI learns from a message, I’d suggest it is “reading” it too, particularly if that AI is highly intelligent.

Machine learning has tremendous potential to give us new insights and understanding, but we also have to be vigilant about the purposes for which it’s used, and about what is done afterwards with the algorithms that have learned from it.
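To make the API-query worry concrete, here is a minimal, hypothetical sketch of a membership-inference probe. Everything in it is invented for illustration: the “model” is a deliberately overfit nearest-neighbour lookup, and the attacker sees only per-query confidence scores, yet those scores alone leak whether a specific record was in the private training set.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Private" records the service trained on (a stand-in for medical data).
train = rng.normal(size=(100, 4))

def api_confidence(x):
    """The black-box API the attacker can call: returns a confidence score.
    The underlying 'model' here is 1-nearest-neighbour, which is badly
    overfit: it is maximally confident on records it has memorised."""
    d = np.linalg.norm(train - x, axis=1).min()
    return 1.0 / (1.0 + d)  # confidence decays with distance from training data

def looks_like_member(x, threshold=0.99):
    """Attacker side: unusually high confidence suggests the record
    was part of the training data."""
    return api_confidence(x) >= threshold

member = train[0]               # a record actually in the private data
outsider = np.full(4, 10.0)     # a record far from anything seen in training

print(looks_like_member(member))    # flagged as training data
print(looks_like_member(outsider))  # not flagged
```

Real attacks (shadow models, model inversion) are more elaborate, but the principle is the same: the more a model has memorised, the more its query interface behaves like a lookup into the private data.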

Technology can augment human capability.