Backing Up FTO Reports

Sat Jul 08, 2017 2:43 pm

The demise of the EpicSki Forum, announced with first 3 days' and then 2 weeks' notice, prompted a frenzy of activity to download and save content. viewtopic.php?f=10&t=12408#p77562

I have now tested the same software, WinHTTrack, on FTO. It was successful and far less time consuming than for Epic. Part of this is that I already have a list of URL's for all TR's I have posted on FTO. It would/will be more time consuming to chase down other threads I might want to save, a process similar to what I did with Epic before May 12.

The first screen of WinHTTrack prompts for a project name and a directory where you want the files stored.

In the left frame you can see my FTO Backup directory and the seven backup projects I ran.

The next screen has a space for you to copy in the URL's you want to save in the current project, shown in blue here.

It looks like there's room for only a few URL's, but that box will take as many as you want.

Then click the "set options" button. Under the "Limits" tab I raised the max transfer rate from the default 25,000 to 250,000.

At this point with Epic I hit "Next" then "Finish" and the program ran its course. FTO proved more problematic: the program failed within 30 seconds and downloaded nothing. I grazed online to see if anyone else had a similar problem. The fix is under "set options" on the "Browser ID" tab. FTO rejects the default, so you must choose the first entry in the drop-down list, referring to MSIE 6.0, shown in blue here:
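For anyone who prefers the command-line httrack to the WinHTTrack screens, here is a rough sketch of the same two settings. I only used the GUI, so treat all of this as assumption: the -O (output path), -F (user-agent) and -A (max transfer rate in bytes per second) options are from the httrack documentation, while the exact MSIE 6.0 string, the example URL and the project directory are placeholders, not what the GUI necessarily sends.

```python
def build_httrack_command(urls, project_dir, user_agent, max_rate):
    """Build an argument list for the httrack command-line tool.

    Nothing is executed here; the list is only assembled.
    """
    if not urls:
        raise ValueError("at least one URL is required")
    return [
        "httrack",
        *urls,                # the URLs you would paste into the GUI box
        "-O", project_dir,    # directory where the project is stored
        "-F", user_agent,     # "Browser ID" sent with each request
        f"-A{max_rate}",      # max transfer rate in bytes per second
    ]


# Assumed values: the user-agent string and URL are illustrative only.
MSIE6_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"

command = build_httrack_command(
    ["http://example.com/viewtopic.php?t=1"],
    "FTO Backup/project1",
    MSIE6_AGENT,
    250000,
)
print(command)
```

If httrack is installed, the resulting list could be handed to subprocess.run(command) to start the download.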

WinHTTrack scrapes a huge amount of data, enough for your download to appear exactly as it does online. This means it will include frames, banners, ads, graphics, pictures, and linked URL's. The Epic downloads would run for 4-8 hours. The FTO downloads never stopped, so after 8 hours I interrupted, hit the tab to "finish current downloads" and the program would run for just a few minutes more.

Sat Jul 08, 2017 2:46 pm

And uses a metric crap ton of our bandwidth.

Sat Jul 08, 2017 3:12 pm

To no surprise there is a morass of directories and subdirectories within each project. But if you open the index file you get a list of your downloaded reports that you requested.

Click on one and you will see, stored on your C drive, the report exactly as it appears on FTO. Click on any picture, view image info, and you SHOULD see that the image has also been downloaded and stored to your C drive.

On the Epic download, HTTrack had some problems with threads in excess of 9 pages. Extra pages LOOKED LIKE they were downloaded but in reality were linked back to the online version. You COULD get those threads by downloading them just a handful at a time but it lengthened the process quite a bit. Thankfully FTO has very few threads where length is excessive, and none on my TR's as far as I know.
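Splitting a long URL list into a handful per project is easy to script. This is only a sketch of the bookkeeping, not something I ran: the batch size of 3 and the example URLs are made up, and each batch would still be pasted into its own project.

```python
def batches(urls, size):
    """Split a list of URLs into consecutive groups of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [urls[i:i + size] for i in range(0, len(urls), size)]


# Hypothetical list of seven long threads, downloaded three at a time.
long_threads = [f"http://example.com/thread{n}" for n in range(1, 8)]
for number, group in enumerate(batches(long_threads, 3), start=1):
    print(f"project {number}: {len(group)} URLs")
```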

On Epic, the URL's on the index contained the title name, so it was relatively easy to find something as long as you know which download project to check. FTO's URL's are opaque. In my case I have an enumerated list on a spreadsheet with the date and topic in other columns, so this is less of a hindrance to me than it might be to other people.

At this point I'm generally declaring victory. All of my FTO TR's including pictures are now on my C drive and not at the mercy of software platform changes, administrator error or indifference, etc. Patrick took this step nearly a decade ago, and the recent Epic debacle demonstrates that he was prescient in that regard.

Patrick told me at Mammoth that he has not reloaded all of his saved TR's to his blog. This can be a very tedious job, though obviously there is no deadline or time pressure to complete it. But given the morass of files that HTTrack scrapes, I was curious to run a test case of cleaning up one TR. I am pleased to report that the task is fairly efficient. Under the project name directory, there is a http://www.firstracksonline subdirectory. The core html file for each report is there. I open that file in Notepad++ and strip out the extraneous code used for frames, ads, banners, etc. Nearly all of this is separate from the text I'm trying to save.

The tedium of this process is usually involved in making the pictures display. If you download an FTO file manually, the picture links will all be back to the website and not to your C drive. This forces you to edit the html code manually for each picture. With HTTrack, the pictures are all in a download subdirectory of the http://www.firstracksonline subdirectory. So I created a download subdirectory under the directory I was using elsewhere in my C drive, then copied all the pictures from HTTrack's download subdirectory into it. Reloading my scrubbed version of the TR html file now displayed all the pics just fine. You don't even need to hunt down the specific pics for each TR. If they are in the proper "download" subdirectory, they will display.
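I did the copy by hand, but it could be scripted along these lines. A minimal sketch, assuming the pictures sit directly in the mirror's download subdirectory as plain files; the function name and whatever directories you pass in are hypothetical.

```python
import shutil
from pathlib import Path


def copy_pictures(source_dir, target_dir):
    """Copy every regular file from source_dir into target_dir.

    Returns the sorted list of file names that were copied.
    Subdirectories of source_dir are skipped.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)  # create "download" if missing
    copied = []
    for path in sorted(source.iterdir()):
        if path.is_file():
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return copied
```

Call it with the mirror's download subdirectory as the source and a download subdirectory next to your cleaned-up html file as the target, so the relative picture links in the saved page still resolve.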

Sat Jul 08, 2017 3:17 pm

Admin wrote: And uses a metric crap ton of our bandwidth.

:-({|= :-({|= :-({|=
Compared to the thousands of hours of work that would be lost if FTO went the way of Epic, I have zero sympathy. The entire download process for all of my FTO TR's dating back to 2001 was completed between July 5 and this morning. At some point I will download enumerated threads of some other FTO topics, but that's likely to be fairly minor compared to the past three days.

And I am still reminded of upcoming platform changes to FTO every time I have to strip out this crap
" onclick=";return false;" onclick=";return false;" onclick=";return false;
when editing a post.
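For what it's worth, that junk could be stripped from saved post text with a one-liner. A sketch only, assuming the fragment always appears exactly as quoted above; the example URL is made up.

```python
# The fragment exactly as it appears in the quoted line above.
ONCLICK_JUNK = '" onclick=";return false;'


def strip_onclick_junk(text):
    """Remove every occurrence of the onclick fragment from text."""
    return text.replace(ONCLICK_JUNK, "")


sample = 'http://example.com/a" onclick=";return false;" onclick=";return false;'
print(strip_onclick_junk(sample))
```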

Sat Jul 08, 2017 3:30 pm

Send your donations for bandwidth fees to:

PO Box 71171
Cottonwood Heights, UT 84171-1171

Thu Jul 13, 2017 12:54 pm

It's just possible that Tony and Patrick are the only ones who care about this.

Thu Jul 13, 2017 2:22 pm

Marc_C wrote: It's just possible that Tony and Patrick are the only ones who care about this.

The archives are clearly the most important feature of FTO during the 7-8 months the News is silent because we have, what, maybe 10 active users left?

At any rate, Patrick is 100% correct that if you want your own content preserved, don't depend on someone else's server/platform for that.
