New episode is out! @dsearls and @katherined talk to @eze_lanza and Tony Mongkolsmai about #ChatGPT, generative #AI, and #opensource software.

Episode 135 - Experts Weigh in on ChatGPT. Listen here: reality2cast.com/135
#podcast #newEpisode #tech

@reality2cast

Hi, thanks for an interesting discussion.

Here is one question I think you missed: for AI systems that are "trained" on data from the internet, suppose that in the future the "new" content added to web pages is increasingly generated by such systems. Then, when the systems are improved by training on that newly added content, they are increasingly taking in their own earlier output as new training input.

1/?

@dsearls @katherined @eze_lanza

@reality2cast

Will the systems still improve in that situation, or will things stagnate for lack of real input from real human beings?

I guess that if you think there is no problem with that, then you could already construct such a feedback loop, where the system eats its own output as input. Does that work, or not?

Will "original content" created by actual humans capable of original thought become increasingly rare and increasingly valuable?

2/2

@dsearls @katherined @eze_lanza
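The feedback loop eliasr asks about can be made concrete with a toy simulation — a minimal sketch under stated assumptions: each generation's "model" just fits a Gaussian to its training data and, like a language model decoding with a tail cutoff, under-samples rare values when generating the next generation's training data. Both the Gaussian model and the 1.5σ cutoff are illustrative assumptions, not any real system.

```python
# Toy sketch: each generation trains on the previous generation's output.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, size=5_000)  # generation 0: human-written data

for gen in range(8):
    mu, sigma = data.mean(), data.std()            # "train" on the current web
    print(f"generation {gen}: std = {sigma:.3f}")
    samples = rng.normal(mu, sigma, size=20_000)   # model-generated content
    keep = np.abs(samples - mu) < 1.5 * sigma      # tail cutoff at decode time
    data = samples[keep][:5_000]                   # next generation's training set
```

The spread of the data collapses within a few generations (roughly a 26% shrink per step with this cutoff). Dropping the cutoff makes the decay far slower and mostly random drift, which hints that the loss of rare, tail content is what drives the degeneration.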

@eliasr @reality2cast @dsearls @katherined @eze_lanza Interesting thought, Elias, and one I hadn't considered. Another question I have: how does AI fact-check? Eager to hear what the podcasters have to say to enlighten us!

@hehemrin @eliasr @reality2cast @dsearls @eze_lanza That's the thing. In the case of ChatGPT, fact-checking is crowd-sourced; I don't think it could fact-check itself without human input, but I'd defer to @eze_lanza on that. It definitely has significant accuracy issues in its current state, as we mentioned in the podcast. Depending on the prompt, it might make up something totally nonsensical that could seem true.

@katherined @hehemrin @eliasr @reality2cast @dsearls I can be pretty sure they do have some validation checks (policies or rules), but it's almost impossible to have rules for every topic, so they need feedback from people, which is gold!

@eliasr @reality2cast @dsearls @katherined Hey, good point! I see it like the computer-vision scenario, where you can create synthetic data to increase your dataset size (data augmentation). It's also useful when you have very little data to train your model; it can help with the training task in that situation, and it works well. An important disclaimer here is that models trained with synthetic data have lower accuracy than those trained on real data. 1/2
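A minimal sketch of the augmentation idea eze_lanza describes, in plain NumPy so it stays self-contained: each real image yields a few synthetic variants (flips, small shifts, noise), inflating a small labelled dataset before training. The 28×28 random arrays and the specific transforms are stand-ins for a real dataset and pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image: np.ndarray, n_variants: int = 4) -> list[np.ndarray]:
    """Make synthetic variants of one (H, W) grayscale image in [0, 1]."""
    variants = []
    for _ in range(n_variants):
        img = image.copy()
        if rng.random() < 0.5:
            img = np.fliplr(img)                         # mirror left-right
        img = np.roll(img, rng.integers(-2, 3), axis=1)  # small horizontal shift
        img = img + rng.normal(0.0, 0.05, img.shape)     # mild sensor-style noise
        variants.append(np.clip(img, 0.0, 1.0))
    return variants

# Tiny demo: 10 "real" images become a 50-image training set.
real_images = [rng.random((28, 28)) for _ in range(10)]
train_set = real_images + [v for img in real_images for v in augment(img)]
print(len(train_set))  # 50
```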

@eliasr @reality2cast @dsearls @katherined
Since the models are tested with real data, you can measure how well the synthetic data worked to improve your system. In my experience it's pretty useful, but the synthetic data has to be validated, either before training the model (by checking the data) or after training, manually or with a validation system. Mmmm, maybe this is what ChatGPT is doing when we provide the feedback?
2/2
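The validation step eze_lanza mentions — always testing against real data — might look like the toy sketch below: both classifiers are scored on a held-out real test set, so the numbers directly show whether the synthetic training data was good enough. The nearest-centroid "model" and the 2-D point data are illustrative stand-ins, not any real pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_real(n: int):
    """Two well-separated 'real' classes of 2-D points."""
    a = rng.normal((0.0, 0.0), 1.0, (n, 2))
    b = rng.normal((3.0, 3.0), 1.0, (n, 2))
    return np.vstack([a, b]), np.array([0] * n + [1] * n)

def train(X, y):
    """'Training' here is just one centroid per class."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def accuracy(centroids, X, y):
    pred = np.linalg.norm(X[:, None] - centroids[None], axis=2).argmin(axis=1)
    return (pred == y).mean()

X_test, y_test = make_real(500)                    # held-out REAL test data

X_syn, y_syn = make_real(50)
X_syn = X_syn + rng.normal(0.0, 0.8, X_syn.shape)  # noisier synthetic stand-in
print("trained on synthetic:", accuracy(train(X_syn, y_syn), X_test, y_test))

X_real, y_real = make_real(50)
print("trained on real:     ", accuracy(train(X_real, y_real), X_test, y_test))
```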

@eliasr @katherined @dsearls @eze_lanza @reality2cast This is already a crisis for translation software. Some people (is OpenAI doing this?) do research on identifying bot-generated content precisely so that bots don't end up training on their own output.
@eze_lanza @eliasr @katherined @dsearls @reality2cast Almost existential. The feedback loop of bots training on themselves and people then listening to the bots could really spin out of control.
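In its simplest form, the filtering clacke alludes to might look like the sketch below: score each crawled page with a bot-text detector and keep only low-scoring pages for training. The `detector_score` heuristic here (plain n-gram repetition) is a hypothetical stand-in, not a real detector used by OpenAI or anyone else.

```python
from collections import Counter

def detector_score(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that are repeats -- a crude bot-ness proxy."""
    words = text.lower().split()
    grams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not grams:
        return 0.0
    return 1.0 - len(Counter(grams)) / len(grams)

def filter_crawl(pages: list[str], threshold: float = 0.2) -> list[str]:
    """Keep only pages that look human-written enough to train on."""
    return [p for p in pages if detector_score(p) < threshold]

pages = [
    "the quick brown fox jumps over the lazy dog near the river bank",
    "great product great product great product great product great product",
]
print([round(detector_score(p), 2) for p in pages])  # [0.0, 0.75]
print(len(filter_crawl(pages)))                      # 1 -- repetitive page dropped
```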

@clacke @eliasr @dsearls @eze_lanza @reality2cast That's very interesting. I could see how self-training on translation might eventually create a new language or dialect. And now I want to go ask ChatGPT to create a new spoken language and see what happens. :)
