
I want to download websites so I can access them offline, with navigation preserved. Let's say a few hundred articles/pages.

Any suggested tool/method? To be done on a Linux machine, preferably with an open-source tool.

Thank you for your answers to my Linux open-source software question at the top of this thread.

Mastodon/Fediverse is great, so nice we humans on earth can communicate and support each other.


@glitchcake Thanks, I just ran a first trial on a site; the result looks correct and is what I wanted.

@scott Thanks, will have a look at that tool as well. Just tried HTTrack, which was also suggested to me.

@dalstroka Not really, not for this purpose. But otherwise I like RSS and use it frequently.

@hehemrin

`wget -r -np -nc -w.5 example.com`

adjust the .5 (the wait in seconds between requests) depending on how much you respect the hoster

@hehemrin Many servers have various kinds of protection against repeated, systematic attempts to contact them. In my experience the best approach is to write a Python script that uses the `requests` library to do the downloads. That gives you full control over what you download and when.
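For reference, a minimal sketch of what such a script could look like, assuming a hand-maintained list of URLs and a hypothetical output directory and delay. Unlike HTTrack or `wget -k`, this does not rewrite links for offline navigation, so that part would still be up to you:

```python
#!/usr/bin/env python3
"""Sketch: fetch a list of pages with the requests library,
with a polite delay between downloads. URLs, output directory,
and delay are placeholders, not a definitive implementation."""

import os
import time
from urllib.parse import urlparse

import requests

# Hypothetical list of pages to mirror; replace with the real URLs.
URLS = [
    "https://example.com/articles/page1.html",
    "https://example.com/articles/page2.html",
]

OUTPUT_DIR = "offline_copy"  # where downloaded pages are stored
DELAY_SECONDS = 1.0          # pause between requests, out of respect for the host

os.makedirs(OUTPUT_DIR, exist_ok=True)
session = requests.Session()
session.headers["User-Agent"] = "offline-archiver (personal use)"

for url in URLS:
    # Build a local filename from the URL path.
    path = urlparse(url).path.strip("/") or "index.html"
    local_path = os.path.join(OUTPUT_DIR, path.replace("/", "_"))

    if os.path.exists(local_path):
        continue  # already downloaded, skip (similar to wget's -nc)

    response = session.get(url, timeout=30)
    response.raise_for_status()
    with open(local_path, "wb") as f:
        f.write(response.content)

    time.sleep(DELAY_SECONDS)  # throttle so the server is not hammered
```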

@mekuso Thanks for the tip and the "warning". I hadn't thought of that, but it makes sense given all the DDoS problems. I tested HTTrack, which someone else suggested, on a site I was interested in, and it worked well there. Right now I'm trying it on my own site; it is taking a long time, so we'll see whether it finishes or whether the web host protects me from myself...

@rspfau Yep, just tried it and it works for one site I was interested in having available offline.

@pinganini If you go to the first message in this thread ("I want to download...") and start from there, you can find several answers. The reason you do not see them directly is likely that the entry you responded to was one I wrote in reply to my first entry, not to the last one in the thread. But go up and start from the first one and you should be able to see the answers.

@wydow Thank you for the suggestion, I've not heard of it before. From the description, it seems not to be the best fit for the use case I have in mind. My use case is an offline copy of a complete website, with its navigation and all. I have tested another suggestion, HTTrack, and in my trial it worked well, but other tools may work better on some sites; that remains to be explored. Thanks for your suggestion!

@breizh Thank you for your suggestion. From a quick reading, I don't think it fits my idea this time. My need is a complete offline copy (snapshot) of website(s), so I can browse the site locally on my machine as if I were using the online version. Thanks for your response!
